Skip to content

Add scanned_manifests_path metadata to snapshots#14406

Merged
brrygrdn merged 2 commits into
mainfrom
brrygrdn/dg-9769-attach-scanned-manifest-path-metadata
Mar 12, 2026
Merged

Add scanned_manifests_path metadata to snapshots#14406
brrygrdn merged 2 commits into
mainfrom
brrygrdn/dg-9769-attach-scanned-manifest-path-metadata

Conversation

@brrygrdn
Copy link
Copy Markdown
Contributor

@brrygrdn brrygrdn commented Mar 10, 2026

What are you trying to accomplish?

We have started to track the snapshots received vs those requested from Dependabot in the service using the manifest[].ecosystem and manifest[].file.source_location but this only allows us to track 'happy path' outcomes.

In case where a manifest has been deleted or there was a fatal error processing the project, we will have an empty manifests collection in the snapshot.

To solve this problem, let's start adding an explicit scanned_manifest_path inventory to the metadata that will always be populated regardless of the snapshot outcome.

Anything you want to highlight for special attention from reviewers?

This is using the current pattern of attaching these values to the snapshots top-level metadata key, in future we will extend the JSON definition with a Dependabot-specific object to validate and track the values a little more concretely, but for now this is good enough.

How will you know you've accomplished your goal?

I will see the new scanned_manifest_path appear in any snapshots I generate from my test repos.

I will also see the smoke tests for graph jobs fail on this PR until I generate an updated test expectation.

Checklist

  • I have run the complete test suite to ensure all tests and linters pass.
  • I have thoroughly tested my code changes to ensure they work as expected, including adding additional tests for new functionality.
  • I have written clear and descriptive commit messages.
  • I have provided a detailed description of the changes in the pull request, including the problem it addresses, how it fixes the problem, and any relevant details about the implementation.
  • I have ensured that the code is well-documented and easy to understand.

@brrygrdn brrygrdn requested a review from a team as a code owner March 10, 2026 16:03
Copilot AI review requested due to automatic review settings March 10, 2026 16:03
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds a new scanned_manifest_paths metadata field to dependency graph snapshots. This field is always populated (unlike the manifests property which can be empty on errors), allowing the downstream snapshot service to verify which manifests were scanned regardless of the snapshot's outcome (success, failure, degraded, or skipped).

Changes:

  • Added a scanned_manifest_paths private method to DependencySubmission that returns an array containing the ecosystem and directory path of the scanned manifest.
  • Included the new scanned_manifest_paths field in the snapshot's metadata hash within the payload method.
  • Added test assertions in both dependency_submission_spec.rb and update_graph_processor_spec.rb to verify the new metadata field across all snapshot status scenarios (success, failed, skipped, degraded, subdirectory).

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
updater/lib/github_api/dependency_submission.rb Adds scanned_manifest_paths method and includes it in the payload metadata
updater/spec/github_api/dependency_submission_spec.rb Adds test assertion for the new metadata field in the "generates submission metadata correctly" test
updater/spec/dependabot/update_graph_processor_spec.rb Adds test assertions for the new metadata field across 4 snapshot status scenarios

Comment thread updater/lib/github_api/dependency_submission.rb Outdated
Comment thread updater/lib/github_api/dependency_submission.rb Outdated
@brrygrdn brrygrdn force-pushed the brrygrdn/dg-9769-attach-scanned-manifest-path-metadata branch from 02f9c81 to a799ef5 Compare March 10, 2026 17:57
@brrygrdn brrygrdn marked this pull request as draft March 11, 2026 13:08
@brrygrdn
Copy link
Copy Markdown
Contributor Author

brrygrdn commented Mar 11, 2026

I may need to rethink this as I don't think this is a valid use of metadata per the API docs, let me revise.

We are being strict about the validation of the metadata as just string keys, so I'm going to work within those constraints for now.

This does lock us in a little one one path == one snapshot but it isn't the only thing that is locking us in on that right now and we have a tech debt item to move our usage of metadata to a Dependabot 'receipt' object at some point so I'm not digging a deeper trench with this approach.

@brrygrdn brrygrdn force-pushed the brrygrdn/dg-9769-attach-scanned-manifest-path-metadata branch from 3ecef0a to 7af435d Compare March 11, 2026 17:43
Comment on lines +191 to +200
# Returns a synopsis of the scan performed in the format `ecosystem::manifest_path`, e.g.
# - `golang::/`
# - `rubygems::rails_app/`
#
sig do
returns(String)
end
def scanned_manifest_path
"#{GithubApi::EcosystemMapper.ecosystem_for(package_manager)}::#{File.dirname(manifest_file.path)}"
end
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💭 We could use the job_correlator for this but it has some handling to ensure we deterministically hash long paths that I don't want to undo as it would change how historical snapshots are compared using these keys.

Rather than overload the responsibility of this value, which is firmly in the snapshot domain, with the requirement to track the job 'receipt' of a snapshot, let's keep these isolated.

Comment on lines +111 to +117
# TODO: Move use of metadata to a Dependabot-specific object
#
# We are using the existing job metadata as a bag-of-values for error handling
# and job tracking that is specific to Dependabot-created submissions.
#
# In future, we should extend the public API schema with a validated object to
# harden this contract.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are tracking this internally, I'm just adding a note here to make sure we keep line of sight that using metadata as a bag of values is something we will need to migrate off of in future.

The values we store here are all of short-term relevant, e.g. they only matter to the actual POST request that submits the snapshot or for state reporting when this snapshot is on the tip of a branch so we don't need to worry about backfilling or handling historical values.

@brrygrdn brrygrdn marked this pull request as ready for review March 11, 2026 17:56
@brrygrdn brrygrdn force-pushed the brrygrdn/dg-9769-attach-scanned-manifest-path-metadata branch from 7af435d to 0c3042b Compare March 12, 2026 11:39
@brrygrdn brrygrdn merged commit e52f56d into main Mar 12, 2026
166 of 167 checks passed
@brrygrdn brrygrdn deleted the brrygrdn/dg-9769-attach-scanned-manifest-path-metadata branch March 12, 2026 12:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants